Method and programmable product for unique document identification using stock and content

ABSTRACT

The present application relates to a method for authenticating and tracking documents. More specifically, the present application relates to authenticating and tracking a document throughout its lifecycle without reliance upon or requirement for any unique identification characters, barcodes and/or objects that were added to the document specifically for the purpose of identification.

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 60/980,621, filed Oct. 17, 2007 entitled “Method and Programmable Product for Unique Document Identification Using Stock and Content,” U.S. Provisional Application No. 60/908,000, filed Apr. 26, 2007 entitled “Apparatus, Method and Program Product for Identification of a Document with Feature Analysis” and U.S. Provisional Application No. 60/951,640, filed Jul. 24, 2007 entitled “Document Processing System Control Using Document Feature Analysis for Identification”, the disclosures of which also are entirely incorporated herein by reference.

TECHNICAL FIELD

The present subject matter relates to a method for authentication and tracking of documents. More specifically, the present subject matter relates to authenticating and tracking of a document throughout its lifecycle without reliance upon or requirement for any unique identification characters, barcodes and/or objects that were added to the document specifically for the purpose of identification.

BACKGROUND

The need to have technology for authentication and tracking of a paper document is becoming a higher priority as security issues abound and technology improves in areas that enhance the ability of criminals to make high quality forgeries. Numerous techniques have been employed to authenticate a document such as barcodes, watermarks, holographic images, or embossed or raised seals. These techniques do not easily offer a different value for each document or each page of a multiple page document and are more easily defeated.

Radio frequency identification (RFID) technology, other inhomogeneous media capable of being interrogated by way of detecting optical scattering from the material, or optical scanners capable of detecting paper fiber orientation can yield arbitrarily random results that are extremely improbable to be repeated. RFID is a broad field of technology covering material or devices that respond to radio frequency illumination. These devices may include but are not limited to active devices that radiate a result when interrogated or passive devices that re-radiate a result when illuminated, wherein the passive devices may include but are not limited to semiconductor devices, material deposited on a substrate, printed material or fibers contained in the paper. For instance, paper stock may be embedded accordingly with RFID fibers for unique identification purposes. However, the identification of the paper stock as originated from an authenticated source is insufficient to validate a document as the original if the actual content to be marked upon the document is not known. An example is the fraudulent activity known as check washing, wherein a check marked by a remitter with valid amount payable data is washed off using chemical ink removal techniques; the valid amount payable data being subsequently replaced with higher (fraudulent) amount payable data. Even a check having an assigned RFID signature would not be protected against instances wherein the hardcopy document is indeed authentic, but the original content data as marked thereon is not.

Thus, there is a need in the existing art for improved methods for maintaining secure tracking and authentication of documents.

SUMMARY

The teachings herein alleviate one or more of the above noted problems with document security and tracking and authentication of documents.

One object of the present subject matter is to provide a method of preparing a document for later authentication. The document is printed on identifiable stock. The method includes acquiring stock identification data from a printed hardcopy of the document by a first sensor coupled with document processing equipment. Content data is obtained for the document and associated with the stock identification data. The content data and stock identification data are stored in a database.

Another object of the present subject matter is to provide a method of authenticating a document printed on identifiable stock. The method includes acquiring stock identification data from a printed hardcopy of the document by a first sensor coupled with document processing equipment. Content data is obtained from an image of the printed hardcopy of the document by a second sensor coupled with the document processing equipment. The content data and stock identification data are compared with associated content data and stock identification data stored in a database. An authentication result is returned indicating whether or not the content data and stock identification data match the stored content data and stock identification data in the database.

Yet another object is to provide a method of generating a plurality of mailpieces containing inserts on document processing equipment for later authentication of the inserts. The method includes associating addressee and/or address data with each of a plurality of inserts printed on identifiable stock. Stock identification data is acquired from each of the plurality of inserts with a sensor. Insert classification data is obtained for the plurality of inserts. The associated address and/or addressee data, acquired stock identification data and obtained insert classification data are stored in a database. The mailpieces containing the inserts are generated on the document processing equipment.

Still yet another object of the present subject matter is to provide a method of authenticating a mailpiece insert printed on identifiable stock. The method includes obtaining stock identification data from the mailpiece insert. The stock identification data is compared with associated stock identification data stored in a database. Address and/or addressee data stored in the database is gathered based on a result of the comparing step. Insert classification data associated with the plurality of mailpieces is acquired from the database. A report associating the insert classification data with the obtained address and/or addressee data is generated.

Additional advantages and novel features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The advantages of the present teachings may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.

BRIEF DESCRIPTION OF THE DRAWINGS

The drawing figures depict one or more implementations in accord with the present teachings, by way of example only, not by way of limitation. In the figures, like reference numerals refer to the same or similar elements.

FIG. 1 Exemplary diagram for collecting document ID data at the point of origin.

FIG. 2 Exemplary diagram for authenticating a document when it is subsequently observed.

FIGS. 3 a and 3 b Exemplary flow diagrams for document data collection and document authentication respectively.

FIG. 4 Exemplary diagram for preparing mail pieces that contain identifiable paper stock such as a coupon or plastic card and creating a mailpiece.

FIG. 5 Exemplary diagram for processing inserts with stock ID and tracking the addressee that received the items.

FIG. 6 Exemplary flow chart of tracking inserts with stock ID.

DETAILED DESCRIPTION

In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. However, it should be apparent to those skilled in the art that the present teachings may be practiced without such details. In other instances, well known methods, procedures, components, and circuitry have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present teachings.

Reference now is made in detail to the examples illustrated in the accompanying drawings and discussed below. FIG. 1 illustrates the start of the authenticated high value document creation process 100. All of the documents are printed 105 on paper stock that can be uniquely identified using sensors and analysis tools of various types 122, such as an RFID interrogator/analysis tool capable of detecting and analyzing embedded conductors, a sensor that interrogates an inhomogeneous medium with a coherent light beam, or high magnification imaging capable of recognizing paper fibers. RFID is a broad field of technology covering material or devices that respond to radio frequency illumination. These devices may include but are not limited to active devices that radiate a result when interrogated or passive devices that re-radiate a result when illuminated, wherein the passive devices may include but are not limited to semiconductor devices, material deposited on a substrate, printed material or fibers contained in the paper. The completed high value document 118 can be any of numerous types, such as a certificate 110, a contract 112, a check 114 or a coupon 116. If the printed content is not known from step 100, an imaging sensor 120 coupled with an extractor module 140 is used to capture an image of and subsequently interpret the contents of the printed material using standard or advanced OCR technology. Concurrently, the paper stock identification is read using the stock ID sensors 122 (e.g., RFID analysis tool). Sensors 120 and 122 are integrated into a document processor including, but not limited to, a scanner or copier. Other document processing devices are contemplated and readily understood by those skilled in the art. The output of the extractor module 140 is information about the content of the document 132 and its stock ID 130. This information is sent to the central data warehouse management system 145 where it is combined with metadata about the document 134.
The metadata typically contains information about the document such as when it was created and what type of document was created (e.g., a will, deed, stock or mortgage). If the printed content is known and is transferred to the central data warehouse management system 145, this data may be used instead of the OCR data 132. The process is not dependent on the data transfer 134 from the document generation system 100, but the added metadata is valuable and OCR errors are eliminated. All of the collected data 130, 132, 134 is stored in the central data warehouse 150 for later use during the authentication process shown in FIG. 2.
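The storage step described above can be sketched as follows. This is a minimal illustration only: the table layout, the function names and the use of a SHA-256 digest over whitespace-normalized content are assumptions made for the example, not details given in this description, and an in-memory SQLite database stands in for the central data warehouse 150.

```python
import hashlib
import json
import sqlite3


def content_digest(content: str) -> str:
    """Digest of whitespace-normalized content (an illustrative choice)."""
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def open_warehouse() -> sqlite3.Connection:
    """Create a hypothetical stand-in for the central data warehouse 150."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "stock_id TEXT PRIMARY KEY, "
        "content_digest TEXT NOT NULL, "
        "metadata TEXT NOT NULL)"
    )
    return conn


def register_document(conn, stock_id: str, content: str, metadata: dict) -> None:
    """Store stock ID (130), content data (132) and metadata (134) together."""
    conn.execute(
        "INSERT INTO documents VALUES (?, ?, ?)",
        (stock_id, content_digest(content), json.dumps(metadata)),
    )
    conn.commit()


conn = open_warehouse()
register_document(
    conn,
    "STOCK-0001",  # hypothetical value read by the stock ID sensor
    "Pay to the order of  A. Smith  $100.00",
    {"type": "check", "created": "2007-10-17"},
)
row = conn.execute(
    "SELECT stock_id, content_digest, metadata FROM documents"
).fetchone()
```

Storing only a digest keeps the record small, but a deployment that must report which part of the content changed would need to retain the content itself.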

FIG. 2 illustrates the document authentication process starting with a document user 200 who has been presented with a document 218 to process and authenticate. In this example, the document 218 as received is one of the high value types described with respect to document 118 of FIG. 1 above. When presented with a document, the document user 200 does not know whether the document 218 presented matches the original 118. With this in mind, the document 218 is first processed by a document processing system such as a scanner 205 that is equipped with an imaging sensor 220 and a stock ID sensor 222. Coupled to the extractor module 140, the stock ID sensor 222 and imaging sensor 220 process the data accordingly to generate the document's unique stock ID 230 and content data 232, respectively. The central data warehouse management system 145, which includes identification, matching and authentication techniques commonly used by those skilled in the art to match data with entries in a database such as the central data warehouse 150, is used to process the stock ID 230 and content data 232 and attempt to find a match in the central data warehouse 150. Since both the stock and content information can be validated (such data having been placed into the database at the time of document creation), it is possible to certify that the document is an original and has not been modified. If modifications to the content are detected, however, these discrepancies will be reported to the document user 200 by way of a user interface (not shown). If all data is confirmed, the document user 200 may receive the authentication indication along with any metadata 210 relevant to said document 218.
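As an illustration of the discrepancy reporting mentioned above, the following sketch compares stored and observed content field by field. It assumes, purely for the example, that content data has already been parsed into named fields; the field names and the function name are hypothetical.

```python
def content_discrepancies(stored: dict, observed: dict) -> dict:
    """Return {field: (stored_value, observed_value)} for every differing field."""
    fields = sorted(set(stored) | set(observed))
    return {
        name: (stored.get(name), observed.get(name))
        for name in fields
        if stored.get(name) != observed.get(name)
    }


# Check-washing scenario: the stock is genuine but the amount was altered.
stored = {"payee": "A. Smith", "amount": "100.00"}
observed = {"payee": "A. Smith", "amount": "900.00"}
report = content_discrepancies(stored, observed)
```

An empty result would correspond to the case in which all content data is confirmed.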

Numerous configurations are possible to accomplish the document processing and authentication tasks described above respective to FIGS. 1 and 2. All items and/or processes depicted in exemplary FIGS. 1 and 2 can be located centrally (e.g., a government office) or the items and/or processes may be distributed across a wide geographic area (e.g., one or more of a city, state or country). If the process is distributed, each element 100 or 200 would be connected over a WAN or a secure internet connection to a remote central processing system running on a server and able to handle numerous document authentication requests simultaneously.

Also, the central data warehouse 150 as presented herein is intended to apply to any system, source or type of electronic data that is searchable or accessible by one or more computers and/or computer executables, and is not intended to be limited by any particular hardware or software implementation. The central data warehouse 150 may be implemented in centralized or distributed fashion (e.g., as a collection of one or more computer or server systems) in accord with various models and design methodologies for achieving varying operational and functional purposes. Furthermore, the central data warehouse 150 may be managed by a management system 145, wherein various hardware, software and network system configurations may be employed. Storage mediums upon which the central data warehouse 150 may be implemented or maintained may include, but are not limited to, disk storage such as DASD, RAID, or other mediums of varying volatility. The central data warehouse 150 may be implemented upon such mediums in accord with varying database file structures, languages or methodologies, including but not limited to Structured Query Language (SQL), Extensible Markup Language (XML), ordered/unordered flat files, Indexed Sequential Access Method (ISAM), heaps, hash buckets, quaternary trees or B+ trees. Those skilled in the art will select the combination of hardware and software according to their architectural requirements.

FIGS. 3 a and 3 b, which highlight an example of a process flow for document authentication, are now explained. FIG. 3 a illustrates the steps associated with collecting data associated with a document. In step 505, a print file is generated which contains the contents to be printed on the stock or added to pre-printed stock, as may be the case for documents such as certificates, deeds or other similar documents. Key document contents are extracted from the print data and merged with metadata. The content data alternatively may be extracted from the source documents used to generate the print file. This information is then printed on stock that has characteristics that enable unique identification. An alternate approach for content extraction is used if the data was not obtained from electronic files (step 510). If content data is not available (step 515), the document will be imaged and the content extracted using optical character recognition techniques. In either case the document is scanned to acquire the unique stock identification (step 520). The combined data of unique stock ID 130, content data 132 and metadata 134 are compiled by the central data warehouse management system 145 and stored in the central data warehouse 150 for later recall for recognition and authentication of the document (step 525).

FIG. 3 b illustrates the steps associated with authenticating a document on subsequent observation. In step 540, the authentication job is set up using metadata such as the type of document (deed, certificate, will or other high value documents) and the date when the document was created. The setup needs to collect and enter sufficient data for the central data warehouse management system 145 to locate the correct data file that contains the data which relates to the document or group of documents to be authenticated. Selecting the correct file depends on how the metadata becomes available. However, if insufficient metadata is available to identify the correct file, a broad search for the records in the central data warehouse 150 will be done to acquire the correct file as part of step 550 after the document is scanned. In step 545, the document is imaged with sensor 220 and then the image is processed to extract the content. In addition, the stock ID sensor 222 is used to obtain the unique stock identifier. With both the content data and the stock ID, the central data warehouse management system will search a specific group of files if metadata was available, or the whole central data warehouse 150 otherwise, to find a match to the document (step 550). If a match was found for the stock identifier (step 555) and for the content (step 560), the document can be certified as authentic (step 565). Otherwise, if the stock identifier matches, but the content does not match the original document, the document may have been modified and may be fraudulent (step 575). Similarly, if the content matches, but the stock identifier does not match, the document is a copy and, therefore, cannot be authenticated (step 575).
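The decision logic of steps 550 through 575 can be sketched as a small function. The example below is a simplified illustration: it models the central data warehouse 150 as a dictionary mapping stock ID to stored content, and the result names are assumptions chosen for readability rather than terms used in this description.

```python
from enum import Enum


class Result(Enum):
    AUTHENTIC = "stock and content match"
    CONTENT_MODIFIED = "stock matches, content differs"
    COPY = "content matches, stock does not"
    NO_MATCH = "neither stock nor content found"


def authenticate(stock_id: str, content: str, warehouse: dict) -> Result:
    """Classify an observed document against stored (stock ID -> content) records."""
    stored = warehouse.get(stock_id)
    if stored is not None:  # stock identifier match (step 555)
        if stored == content:  # content match (step 560)
            return Result.AUTHENTIC  # certified (step 565)
        return Result.CONTENT_MODIFIED  # possible alteration (step 575)
    if content in warehouse.values():
        return Result.COPY  # known content on unknown stock (step 575)
    return Result.NO_MATCH


warehouse = {"STK-1": "Deed of sale, lot 42"}
```

For instance, `authenticate("STK-9", "Deed of sale, lot 42", warehouse)` yields `Result.COPY`, because the content is known but the stock is not.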

FIG. 4, in association with FIG. 5, illustrates the process of preparing mailpieces that contain identifiable paper stock, such as a document of one or more pages, a coupon or plastic card, and of creating a mailpiece. One objective of this process is to enable association of a mailpiece insert 305, which may be a coupon 310, 311 or a plastic card such as a credit card 312, driver's license 313 or other high value insert, with the addressee and/or address on the mailpiece. The process allows for the association of the addressee and/or address with the insert 305 (FIG. 5) when it is presented for use. For example, this includes authenticating the name and address on a driver's license 313 or authenticating the name on a credit card 312. Both authentications are performed by checking the scanned stock ID of the presented item against data in the central data warehouse 150. If a match is found between the stock ID and the database, metadata can be retrieved containing such items as name, address, security questions, DOB and other information that those skilled in the art deem useful. Another objective is to recognize a coupon 310, 311 based on matching the stock ID for the coupon against the central data warehouse 150. By utilizing a central data warehouse 150 to access the addressee and insert identification data, based on the stock ID match, the addressee that used a coupon can be determined at the redemption center or at the point of sale. If data is collected at the point of sale (POS), each POS will be connected to the central data warehouse management system so that the stock identification can be made and collected metadata can be returned. The resulting data can be used for marketing research and to verify the coupon owner, the value of the coupon service, point of sale, item purchased and any other data that those skilled in the art may find useful for analysis.
These data items may be added to the central data warehouse 150 as metadata for later analysis, compiled into a separate printable report or added to a separate data structure.
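A brief sketch of the stock ID lookup and reporting described above follows. The record fields, stock ID values and function names are hypothetical, and a plain dictionary stands in for the central data warehouse 150.

```python
from collections import Counter

# Hypothetical records keyed by stock ID, as captured during mailpiece production.
warehouse = {
    "STK-A1": {"addressee": "J. Doe", "address": "1 Main St", "insert_type": "coupon-310"},
    "STK-A2": {"addressee": "M. Roe", "address": "2 Oak Ave", "insert_type": "coupon-311"},
    "STK-A3": {"addressee": "P. Poe", "address": "3 Elm Rd", "insert_type": "coupon-310"},
}


def redemption_report(redeemed_stock_ids, records):
    """List (insert type, addressee, address) for each recognized stock ID."""
    rows = []
    for stock_id in redeemed_stock_ids:
        record = records.get(stock_id)
        if record is not None:  # unrecognized stock IDs are skipped
            rows.append((record["insert_type"], record["addressee"], record["address"]))
    return rows


def redemptions_by_type(rows):
    """Count redemptions per insert type, e.g. for marketing research."""
    return Counter(insert_type for insert_type, _, _ in rows)


rows = redemption_report(["STK-A1", "STK-A3", "STK-ZZ"], warehouse)
counts = redemptions_by_type(rows)
```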

The exemplary computer processing architecture of FIG. 4 can be configured in numerous ways without affecting the functionality of the concept. The key processors include the data center processor 300, which is the source of print files 320 that are used to control the document printer 322 and which provides inserter control data 301 to the control computer 414 of the mail processing system 350 of FIG. 5. The data center processor 300 also communicates 302 with the central data warehouse management system 145 to provide document and insert metadata, addressee data and captured unique stock ID 130 and content data 132. Alternately, the document and insert content data may come directly from data files in the data center processor 300. The extractor module 140 may be a separate processor which is used to read the content information 132 that is derived from an imaging sensor 120 and to read the stock ID with the appropriate sensor such as an RFID analysis tool 122. The identification code is processed by the extractor module 140 to produce the stock ID 130. The inserts 305, coupons 310, 311, plastic cards 312 and driver's licenses 313 are all created by a separate process (not shown) where information is printed on paper stock or plastic material that can be uniquely identified by the appropriate sensor 122. One method is to embed conductive fibers in the material that can then be read with an RFID sensor to produce a unique identification number or identification signature that will not be repeated in a prescribed period of time, as defined by the postal authority or business process. For example, value documents may have to be unique for hundreds of years while coupons may only require months of uniqueness. The inserter control computer 414 will control the addition of the inserts to a given document, so that a mailpiece contains both custom printed material 330 and inserts, both of which are stuffed into the same envelope 362. Alternately, the finished envelope 361 may contain only coupons.
The net result is finished mailpieces 360 that are provided to the postal authority for delivery to a postal customer.

The exemplary process steps are as follows. The data center processor 300 provides a print file 320 to the printer 322 to control content and addressee printing. If document identification is required, the paper stock will contain unique identification features. When the document is printed 324, imaging 120 and stock ID 122 sensors verify the printing, capture content and addressee, and associate the printed document with its stock identification using the extractor module 140. This data will be provided to the central data warehouse management system 145 where it is correlated with additional data from the mail processing system 350 which is derived during the mailpiece production (FIG. 5). The data is stored in the central data warehouse 150 for later reuse. Processing steps as described herein may be adapted accordingly by those skilled in the art.

FIG. 5 illustrates the processing steps associated with the utilization of stock identification to track items in a mail processing environment through a coupon redemption center. In some cases the redemption center may not process the physical coupons, but use a distributed process which uses point of sale devices to recognize the coupon type and read the unique stock ID. If the document which is included in the envelope with tracked coupons or tracked plastic cards is being tracked, a process similar to that described for FIGS. 1 and 2 will be utilized. In a mail processing system 350, such as an inserter, one or more analysis tools, sensors, or a suite of various sensors/tools, depicted as 120, 122, 416, 404-404 n and the like, may operate upon a document being processed. The analysis tools may be positioned inline at various points along the inserter 350 for analyzing the documents in real-time, or alternatively offline for post-inserter processing analysis. For example, the analysis tools 120, 122, 416 and 404-404 n are high speed imaging devices (e.g., readers, cameras, etc.) for acquiring and/or interpreting the content markings that appear on a scanned document and high speed paper stock or plastic card stock identification sensors. Coupled to the inserter 350 is a control computer 414, which may provide a user interface that enables an operator of the inserter 350 to interact with inserter control software that runs the inserter. Alternatively, the inserter control computer 414 may also be coupled to an extractor module 140, which may be further coupled to additional analysis tools (i.e., high resolution cameras and radio frequency analysis devices 120, 122, 416, 404-404 n) for detecting stock data respective to hardcopy documents being processed. Those skilled in the art will recognize, of course, that various implementations may be employed other than that depicted herein.
It should be noted that the inserter computer 414 is typically in control of all elements of the inserter 350 so that assembly of the finished mailpiece 360 is correctly performed and each step is verified as it is accomplished. The inserter control computer 414 tracks the location of all the material that is being assembled to form a mailpiece 361, 362. As a result the components of the mailpiece can be tracked to the address and addressee and associated with the insert type that was added to a mailpiece by each insert feeder 402-402 n. The stock ID detectors (scanners) 404-404 n obtain the stock ID for each insert as it is added to the material being assembled for a mailpiece. The insert type, stock ID and addressee and/or address are collected by the inserter control computer 414 and the extractor module 140 and transferred to the central data warehouse management system 145 for storage in the central data warehouse 150.

As a first exemplary point of observation, the analysis tool/sensing device may observe a document as it is engaged in front-end inserter processing activities. Such activities may include loading the printed material 330 into the document input section 400 of the inserter 350, wherein the printed material may be cut or folded accordingly to construct a document of desired size. Generally, the roll of paper is printed in advance by one or more printer modules (not shown) to display the various objects and/or characters that comprise the human or machine readable content of the document. In the case of a camera being employed as the analysis tool 120, image data pertaining to the document may be compiled and translated into content data by the extractor module 140. Likewise, a radio frequency analysis system (stock ID sensor 122 coupled with the extractor module 140) may be utilized correspondingly for acquiring stock identification data. An extractor module 140 may be integrated with and/or communicable with the suite of analysis tools/sensors 120 and 122. As before, stock and content data may be persistently stored by the extractor module 140 during the time of document analysis. This data is then aggregated and packaged into a data structure, which may subsequently be analyzed against data maintained in the central data warehouse 150.

Also, as indicated before, various content data elements of interest may include word count per page, tab spacing and indentation lengths, margin lengths, number of paragraphs, number of lines, character and/or object coordinate information, and any other data descriptive of the physical appearance of the hardcopy document. Fold and/or cut line location data may also be stored, such as by determining the distance from an edge of the paper to a point of contact with a cutter as measured from an image depicting this point of contact. Stock data associated with the structural composition of the document may include radio frequency data as emitted by intentionally embedded conductive fibers which will reradiate a unique signature when interrogated by an RFID sensor. Alternatively, reflectance and contrast data, paper density, or paper texture information may also be employed as stock identification data. Also, in association with the stock and content data, the extractor module 140 may compile metadata information created by the imaging device 120 as it processes a document. In particular, the metadata may include timestamp information, machine ID, machine location, components assembled into a mailpiece, etc. By associating the metadata with the stock and content or insert type data collected during inserter processing, a historical account of the activities involving the document or insert is maintained.

The data collection process continues at the other analysis points along the inserter 350, including during accumulation and merging of the various inserts 305 with a document and during envelope insertion, as performed by the transport 401, the insert feeders 402-402 n and the envelope inserter 405. In the case of accumulation and document merging and envelope insertion by the envelope feeder 418, this involves the association of varying inserts 305 with a given document being transported through the transport system in order to compile a distinct mailpiece. For example, when the inserts are one or more coupons 305, different documents intended for differing recipients may require different coupons (i.e., target marketing). One or more analysis tools 404-404 n may be employed for performing analysis upon documents at this stage of inserter processing. These devices may be physically placed in proximity to the insert feeders 402-402 n so as to enable acquisition and/or extraction of content and stock identification data pertinent to the inserts being merged with an associated document. In this way, a correlation between the document (plus addressee) being processed through the inserter 350 and the inserts may be achieved, which may provide further tracking or analysis implications.

Further stock and content data may be acquired and/or identified at the output system 406 of the inserter 350. Still further, stock and content data may be extracted by the extractor module 140 at the point of processing by other devices 408, including those for applying postage marks, printer marks, address data, labels or other physical manipulations to the hardcopy document. In the case where the mailpiece will contain only coupons 361, the address is printed at section 408 in the inserter. The address and addressee are associated with the insert 305 stock identifications for the items in the envelope. Inline devices may include, but are not limited to, postage meter systems, postage application devices, printers, or labelers. In some instances, these other inline devices may be designated as an analysis tool, and thus may be integrated with an extractor module 140 for enabling the generation of stock and content data. For example, a postage meter enabled with a sensor 120, 122 connected to the extractor module 140 could record postage affixed data as applied to a document as stock and content data. Doing so creates an additional audit trail that could be useful for the user/operator or postal authority in reconciling postage payment discrepancies. Such content and/or stock data may be acquired through usage of a sensor or sensor suite 416 placed in proximity to said inline processing devices, which are themselves generally positioned prior to entry of the finished mailpieces 360 into the envelope stacker 412.

Tracking of documents from printed stock 330 and inserts 305 into a specific envelope with a known addressee and address is required for accurate performance of the concept. This function is performed with the inserter control computer 414 in conjunction with the inserter control file 301 which specifies how the mailpiece is to be assembled. A document is identified by sensors 120 and 122 when it is received in the document input section 400 and then tracked through each step of the insertion process. When the document reaches the first insert feeder 402, the inserter control computer 414 will determine if that insert is required. If it is required, the sensor 404 will read the stock ID and associate it with the known contents of that insert feeder. This data is appended to the document data of address and addressee plus metadata if available. This process is repeated at each insert feeder until the last feeder 402 n is reached. Therefore, when the documents and inserts reach the envelope inserter 405, the exact contents are known, plus the stock ID for each item contained in the envelope. The resulting data file is sent to the central data warehouse management system 145 from a combination of the extractor module 140, inserter control computer 414 and the data center processor 300 as dictated by the specific design. The data is stored in the central data warehouse 150 for later usage.
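The per-feeder tracking loop described above can be sketched as follows. This is an illustrative model only: the class and function names are hypothetical, the set of required feeders stands in for the inserter control file 301, and a caller-supplied function stands in for the stock ID sensors 404-404 n.

```python
from dataclasses import dataclass, field


@dataclass
class MailpieceRecord:
    addressee: str
    address: str
    inserts: list = field(default_factory=list)  # (insert_type, stock_id) pairs


def assemble(record, required_feeders, feeders, read_stock_id):
    """Walk the feeders in order; for each required one, read and append the stock ID."""
    for index, insert_type in enumerate(feeders):
        if index in required_feeders:  # decision taken from the control data
            stock_id = read_stock_id(index)  # sensor read at this feeder
            record.inserts.append((insert_type, stock_id))
    return record


# Simulated run: three feeders, of which this mailpiece needs the first and third.
feeders = ["coupon-310", "coupon-311", "credit-card-312"]
simulated_reads = {0: "STK-01", 2: "STK-02"}
record = assemble(
    MailpieceRecord("J. Doe", "1 Main St"),
    required_feeders={0, 2},
    feeders=feeders,
    read_stock_id=simulated_reads.__getitem__,
)
```

When the loop finishes, the record holds the addressee, the address and the stock ID of every item in the envelope, which is the data file that would be forwarded for storage.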

Alternately, if only coupons are being inserted into an envelope, the document input section 400 is not required. The inserter control computer 414 will track each insert that is added to a group of inserts as the groups are moved through the transport 401. Hence, when the group reaches the envelope inserter 405, the contents of each coupon are known along with its stock identification that was read by each detector 404 through 404 n. Since the address and addressee are not yet associated with the envelope, tracking of the envelope with its known contents must continue until the address and addressee are printed on the envelope at section 408. Alternate configurations of the mail processing system 350 are common, such as replacing the envelope feeder 418 and envelope inserter 405 with a wrapping system that manufactures the envelope during production. At this point all information is known and transferred to the central data warehouse 150.

Attention is now directed towards the central data warehouse management system 145. Once the final document is complete (i.e., the coupons 361 are assembled for delivery, or an envelope 362 containing a document and inserts is assembled), it is ready for distribution to the intended recipient or customer 420. When the customer utilizes the coupons 305R at a participating store 425, the coupon is collected at the store, and further redeemed via a redemption center 430. The redemption center 430 may use a distributed process to collect redemption data at the store 425 using a POS device 426 equipped with a stock ID sensor 122. A coupon identification sensor, which may include an imaging system or a barcode reader, also is required. This approach allows for collection of additional data in regard to the sale and saves the effort of sending the coupon to the redemption center. As an added feature, coupon reuse can be prevented by refusing any coupon whose stock ID has already been associated with a redemption. The redemption center may be equipped with the same types of analysis tools for acquiring stock and content data as described above. Hence, the stock and content data is stored as a data structure by an extractor module 140 operable in connection with the redemption center 430. This data is then transmitted to the central data warehouse management system 145 (e.g., internal or external transmission).

The central data warehouse management system 145 extracts the data populating each field of the data structure, performs any decomposition/formatting of the data if required, then checks the central data warehouse 150 to determine if it matches any existing stock and content data. The match determination process, as recognized by those skilled in the art, may be executed using varying types of matching algorithms and/or logical instructions. Furthermore, the match determination process may be performed in accord with match sensitivity settings so as to enable high-confidence or threshold-based (e.g., specified percentage match) evaluation of the stock and content data against data within the minutiae database. For example, if the match threshold/sensitivity is set to 75%, then a stock and content data set that matches each of the other data sets within the database at less than 75% would be considered a non-match. Suffice it to say, any effective or known means of match determination processing is within the scope of the teachings herein.
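One possible reading of the threshold-based evaluation is sketched below, purely as an example. It assumes, hypothetically, that each data set is a flat mapping of field names to values and that the match percentage is the fraction of shared fields whose values agree exactly; a practical system would more likely compare stock measurements within a tolerance. The names match_fraction and find_match and all field names are illustrative.

```python
def match_fraction(candidate, stored):
    """Fraction of the fields present in both data sets whose values agree."""
    shared = candidate.keys() & stored.keys()
    if not shared:
        return 0.0
    agreeing = sum(1 for name in shared if candidate[name] == stored[name])
    return agreeing / len(shared)


def find_match(candidate, database, threshold=0.75):
    """Return (document_id, score) for the best match, or (None, score) if below threshold."""
    best_id, best_score = None, 0.0
    for doc_id, stored in database.items():
        score = match_fraction(candidate, stored)
        if score > best_score:
            best_id, best_score = doc_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Hypothetical stored data sets keyed by document identification value.
database = {
    "DOC-1": {"fiber_signature": "a1f3", "word_count": 212, "margin_left": 1.0, "line_count": 40},
    "DOC-2": {"fiber_signature": "9c0e", "word_count": 180, "margin_left": 1.0, "line_count": 36},
}
scan = {"fiber_signature": "a1f3", "word_count": 212, "margin_left": 1.0, "line_count": 41}
weak_scan = {"fiber_signature": "a1f3", "word_count": 212, "margin_left": 1.25, "line_count": 41}

result = find_match(scan, database)            # three of four fields agree with DOC-1
weak_result = find_match(weak_scan, database)  # only two of four fields agree
```

With the threshold at 75%, the first scan is accepted as DOC-1 and the second is treated as a non-match.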

Once the stock and content data has been transmitted to the central data warehouse management system 145, it may be compared to determine if it matches any existing stock and content data previously associated with the document via a document identification value. If a match is determined, an identification alert may be transmitted to the error tracking or fraud prevention group of the redemption center 430. Additionally, the data on record may be updated to include additional stock and content data not previously identified (e.g., a pen mark applied by the recipient to the physical document 502), as well as updated metadata (e.g., time stamp data, analysis tool ID data, recipient ID data). The central data warehouse may be implemented via a server, wherein all document identification values and their associated stock and content data and/or metadata information are stored.
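The updating of a record on file may be pictured, as a non-limiting sketch, as merging newly observed data points into the stored data set and appending the observation metadata to a history. The record layout, the function name update_record and the sample values are assumptions made for illustration only.

```python
def update_record(stored, observed, metadata):
    """Merge data points not previously on record and log the observation metadata."""
    added = {name: value for name, value in observed.items()
             if name not in stored["data"]}
    stored["data"].update(added)
    stored["history"].append(dict(metadata))
    return sorted(added)


# Hypothetical record on file and a later observation that includes a pen mark.
stored = {"data": {"fiber_signature": "a1f3", "word_count": 212}, "history": []}
observed = {"fiber_signature": "a1f3", "word_count": 212,
            "pen_mark_region": (410, 88, 460, 120)}
new_fields = update_record(stored, observed,
                           {"timestamp": "2007-10-17T09:30:00", "tool_id": "scanner-7"})
```

Existing values are left untouched in this sketch; only fields absent from the record are added.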

Referring now to FIG. 6, which is an exemplary flow chart of a coupon redemption system where coupons are assembled into the mailpiece 361, the document processing system inserting device is initially set up with the data and material needed for operation in step 605. The coupons are loaded into their respective feeders and address data to be printed on the envelopes is loaded into the printer. The coupons may be loaded as packages of pre-processed groups and fed as a group into the inserting device. In this case, each group is identical and the stock ID for each coupon is pre-scanned. The necessary data file also is provided as part of setup. During the inserter production run (step 610), the mailpiece content of coupons is assembled from the pre-processed groups, if used, and from the insert feeders 402-402 n. All feeders can add a coupon to each mailpiece or only selected feeders can be used depending on the inserter control file instructions. The stock ID detectors 404-404 n are used to record the stock ID of each coupon as it is fed. The inserter control computer 414 tracks each group of coupons as it moves through the inserting device. A temporary document ID is often created to aid in association of the data with each group of coupons that are being created. In step 615, coupons are inserted into an envelope and the addressee and address are printed on the envelope. The temporary mailpiece ID is used to aid in the data association with the list of coupon types and stock IDs plus address and/or addressee information. Having both address and addressee data is the most useful for the eventual market data compilation, but in some instances the addressee is identified only as “resident”. In this case the address is the primary means of identification of the coupon user. The complete data for each mailpiece is transferred to the central data warehouse management system 145 for storage. The mailpiece is then delivered by the postal service (step 620).

Continuing with FIG. 6, a customer receives the mailpiece and selects coupons to redeem at the store 425 (step 625). If the store does not have a point of sale (POS) terminal equipped to identify the coupon type (i.e., read the coupon barcode) and to read the coupon stock ID (step 630), the coupon must be forwarded to the redemption center (step 645). If the POS is equipped with the necessary scanners, the coupon type and stock ID can be read and the data associated with the sale also can be recorded at the store (step 635). A coupon validity check can be made to see if the coupon has already been used for redemption. The data associated with the transaction is compiled and transferred to the redemption center (step 640). In the case where the coupons are received at the redemption center 430, the coupon type, stock ID and validity must be checked (step 650). Finally, the coupon type and stock ID data are used to obtain a match with the stored data in the central data warehouse by matching systems in the central data warehouse management system 145. When a match occurs, the addressee and/or address data can be associated with the coupon data and POS data to build a product marketing profile of the person and/or persons at the residence (step 655).
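The validity check and profile association of steps 635 through 655 may be sketched as follows, under stated assumptions. The sketch assumes, hypothetically, that the central data warehouse can be queried by stock ID, that each stored entry carries the coupon type together with the address and addressee, and that redeemed stock IDs are kept in a simple set. The function name redeem and all sample data are illustrative.

```python
def redeem(stock_id, coupon_type, warehouse, redeemed, pos_data=None):
    """Validate a coupon by stock ID, refuse reuse, and return profile data on success."""
    entry = warehouse.get(stock_id)
    if entry is None or entry["coupon_type"] != coupon_type:
        return {"status": "no_match"}
    if stock_id in redeemed:
        return {"status": "already_redeemed"}
    redeemed.add(stock_id)  # associate the stock ID with a redemption
    return {
        "status": "redeemed",
        "coupon_type": coupon_type,
        "address": entry["address"],
        "addressee": entry["addressee"],
        "pos_data": pos_data,
    }


# Hypothetical warehouse contents written when the mailpiece was assembled.
warehouse = {
    "STOCK-402": {"coupon_type": "cereal-50c", "address": "1 Main St", "addressee": "resident"},
}
redeemed = set()
first = redeem("STOCK-402", "cereal-50c", warehouse, redeemed, {"store": "425"})
second = redeem("STOCK-402", "cereal-50c", warehouse, redeemed)
unknown = redeem("STOCK-999", "cereal-50c", warehouse, redeemed)
```

The first presentation succeeds and yields the address for the marketing profile; a second presentation of the same stock ID is refused, as is a stock ID with no stored match.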

Data processing (i.e., stock and content data or metadata collection) is performed by an extractor module 140, an executable module integrated with and/or communicable with a process, device or utility (e.g., software, hardware, or firmware processes or tools) capable of operating upon a hardcopy document. The extractor module 140 operates to extract stock and/or content data made available by hardcopy documents. Moreover, the extractor module 140 is deployable for independent operation upon or integration with the various devices or utilities usable for analysis of hardcopy documents. In this way, a plurality of extractor modules may relay information to each other if necessary and/or communicate with a central data warehouse management system 145. In addition, the extractor module 140 may also communicate with the particular device, tool (e.g., software) or process with which it is operating in association, i.e., to provide tracking information or ID notification data.

The central data warehouse management system 145 is a device (e.g., server), executable module or process that analyzes document stock and content data provided by an extractor module 140 in the form of a data structure. In other instances, the central data warehouse management system 145 communicates relevant information pertaining to a document to the extractor module 140. In general, the central data warehouse management system 145 processes the various fields of the data structure in order to access the data contents therein, and then executes a comparison of the document stock and content data received against existing document stock and content data stored in a central data warehouse 150 to determine if it is associated with a particular document identification value. Suffice it to say, when an extractor module 140 is integrated with a document processing medium (e.g., a printer, document authoring software, high-speed inserter device), printstream management medium (e.g., printstream creation software) or analysis tool (e.g., imaging device, spectrometer) that operates upon the document, the extractor module 140 may access key information representative of the unique elements and features of the document.

When documents such as the stock certificate 110 are printed from a computing device 100 by a printing device 105, various types of analysis tools may be employed for processing the document to obtain unique stock and content data. As a first type of analysis, a high resolution imaging device and integrated radio frequency analysis tool 122 may be used to perform stock analysis 216 of the printed document. The stock analysis may include analysis of the fiber structure in high fiber content paper, analysis of the paper density that naturally occurs when the paper pulp is compressed, or analysis of paper textural features that may be intentionally introduced into the paper, such as RFID (radio frequency identifier) fibers. In performing the analysis, the entire document may be analyzed, or alternatively, a specific region-of-interest of the document may be analyzed.

The latter increases the speed and efficiency of the analysis process, while the former increases the number of unique stock data points capable of being generated. However, those skilled in the art will appreciate that from an internal microscopic level of perception, even two documents appearing identical physically (e.g., same content, layout, formatting, typesetting) will differ greatly structurally even if compared against one another at a limited region-of-interest. As such, the analysis tool need only observe a limited sample of the document (i.e., the rightmost bottom region of the document, to within a rectangular region of 0.25×0.25 inches). Alternatively, the region-of-interest need not be symmetrical, but may instead be asymmetrical (e.g., a region enclosed by a freeform object) as defined by the operator of the analysis tool. Either way, restricting the fiber composition analysis to a smaller defined region-of-interest greatly increases the rate of processing of documents for performing such analysis, and enables feasibility of implementation within residential, commercial and industrial settings.
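As a minimal sketch of the rectangular case, the pixel bounds of a 0.25×0.25 inch region at the rightmost bottom corner can be computed from the scan dimensions and resolution. The function names roi_box and crop, the representation of the image as a list of pixel rows, and the scan resolution used in the example are assumptions and are not specified in the disclosure.

```python
def roi_box(width_px, height_px, dpi, size_in=0.25):
    """Return (left, top, right, bottom) of a square region at the bottom-right corner."""
    side = round(size_in * dpi)
    side = min(side, width_px, height_px)  # never exceed the scanned image
    return (width_px - side, height_px - side, width_px, height_px)


def crop(image, box):
    """Cut the region out of an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# A hypothetical letter-size page (8.5 x 11 inches) scanned at 600 dpi.
box = roi_box(5100, 6600, 600)

# A tiny 4 x 4 stand-in image at 8 dpi, so the region is 2 x 2 pixels.
tiny = [[r * 4 + c for c in range(4)] for r in range(4)]
sample = crop(tiny, roi_box(4, 4, 8))
```

Only the cropped region would then be passed to the fiber analysis, which is where the processing-rate gain arises.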

Another type of analysis of the document 118, 218 for collecting content data may be conducted using an imaging device 120. Exemplary imaging devices 120 for collecting content data may include, but are not limited to, scanners, optical readers, cameras, copy machines, fax machines, etc. An image of the hardcopy document may be analyzed using resolution imaging and magnification techniques to reveal unique content data points characteristic of the original document 118, as depicted with respect to the composite image. Document content data collected by the extractor module 140 operating in association with the imaging device 120 may include, but is not limited to: word count per page or per the entire document, tab spacing and indentation lengths, margin lengths, paragraph numbers, header/footer locations, image locations, line numbers, line spacing, character and/or font spacing, number of characters with and without spaces, textual color properties, text string and character coordinate information, paper stock, paper type/dimensions, and other such data descriptive of the physical characteristics of the various objects and/or characters that appear on the hardcopy document. Also, in association with the document stock and content data, the extractor module 140 may compile metadata information created by the imaging device as it processes the document 218. As will be apparent to those skilled in the art, the stock and content data collected by imaging the hardcopy document to a great extent mirrors the stock and content data collected. It will be seen later on that this is an intentional feature of the present example, for enabling advanced tracking and linking of the hardcopy version of a document to its original electronic representation or representation derived from an image and history data (via the assigned document identification value).
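A few of the listed content data points can be derived from recognized page text, as the following illustrative sketch shows. It assumes the text of a page has already been recovered (for example by OCR) as a plain string with one line of the page per text line; the function name content_features and the chosen field names are hypothetical, and layout measurements such as margins or coordinates would require the image itself.

```python
def content_features(page_text):
    """Compute simple content data points from the recognized text of one page."""
    lines = page_text.splitlines()
    words = page_text.split()
    return {
        "word_count": len(words),
        "line_count": len(lines),
        "chars_with_spaces": sum(len(line) for line in lines),
        "chars_without_spaces": sum(len(word) for word in words),
    }


# Hypothetical two-line page.
features = content_features("Stock certificate\nNo. 42")
```

Such values could populate fields of the data structure alongside the stock data for the same page.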

Those skilled in the art will recognize that various other tools not expressly presented herein may also be utilized during the first observation stock and content collection phase 52 for characterizing the physical and structural qualities of the document. For example, OCR technology may be employed for interpreting the plurality of markings resident upon a document, where the results of the interpretation may be further employed as stock and content data. Such analysis may be employed on a case-by-case basis, however, given that no single marking is sufficient in and of itself to uniquely identify a document from amongst a myriad of possibilities. The interpretation of a single element of content (e.g., words, text strings, barcodes) of a document does very little to enable one to identify a specific instance of a document against even numerous photocopied versions thereof having the same identical content. Indeed, practitioners of the art may employ their own suite of sensors or analysis tools for processing of documents in accordance with their own requirements.

In an effort to further enhance data processing rates for the above-described analysis tools, only select stock and content data of interest need be stored into the data structure 224. In particular, only the stock and content data most pertinent to characterizing the physical (e.g., text coordinates, word counts) and structural composition of the document (e.g., microscopic/macroscopic, fiber, chemical) within the region-of-interest need be compiled. Of course, the number of data points, measurements or calculations retained as stock and content data may be customized to fit specific processing environments, organizational capabilities or user needs. In this way, the analysis tools may be adapted accordingly to ensure higher scan rates, sampling speeds, timing settings, and signal processing for analysis of the samples under analysis.

The data structure for aggregating the stock and content data may then be communicated via a network connection to the document minutiae processing module (not shown), which may reside locally in proximity to the analysis tool via a local server or at a remote server or location.

In the illustrated examples, computers or servers such as 145, 140, 300 are intended to represent a general class of data processing device commonly used to run programming. Such a device typically utilizes general purpose computer hardware to perform its respective server processing and to control the attendant communications via the network(s). Each such server, for example, includes a data communication interface for packet data communication. The server also includes a central processing unit (CPU), in the form of one or more processors, for executing program instructions. The server platform typically includes program storage and data storage for various data files to be processed and/or communicated by the server, although the server often receives programming and data via network communications. The hardware elements, operating systems and programming languages of such servers are conventional in nature, and it is presumed that those skilled in the art are adequately familiar therewith.

In the illustrated examples, user terminal devices are generally illustrated as personal computers (PCs) or the like. Such devices are intended to represent a general class of data processing device commonly used to run client software and various end-user applications. The hardware of such personal computer platforms typically is general purpose in nature, albeit with an appropriate network connection for communication via the intranet, the Internet and/or other data networks. As known in the data processing and communications arts, each such general-purpose personal computer typically comprises a central processor, an internal communication bus, various types of memory (RAM, ROM, EEPROM, cache memory, etc.), disk drives or other code and data storage systems, and one or more network interface cards or ports for communication purposes. Of course, a personal computer or other end user data device will also have or be coupled to a display and one or more user input devices such as alphanumeric and other keys of a keyboard, a mouse, a trackball, etc. The display and user input element(s) together form a user interface, for interactive control of the computer and through the computer to control other mail processing operations. These user interface elements may be locally coupled to the computer, for example in a workstation configuration, or the user interface elements may be remote from the computer and communicate therewith via a network. The hardware elements, operating systems and programming languages of such end user data devices are conventional in nature, and it is presumed that those skilled in the art are adequately familiar therewith.

Aspects of the methods outlined above may be embodied in software, e.g. in the form of program code executable by the server or other programmable device. Such software typically is carried on or otherwise embodied in a medium or media. Terms such as “machine-readable medium” and “computer-readable medium” as used herein generically refer to any medium that participates in providing instructions and/or data to a programmable processor, such as the CPU of a server or end user data device or in any of the computers controlling various mail processing equipment, for execution or other processing. Such a medium may take many forms, including but not limited to, non-volatile storage media, volatile storage media, and transmission media. Non-volatile storage media include, for example, optical or magnetic disks. Volatile storage media include dynamic memory, such as main memory or cache. Physical transmission media include coaxial cables, copper wire and fiber optics, including wired and wireless links of a network and the wires that comprise a bus within a computer or the like. Transmission media, however, can also take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during optical, radio frequency (RF) and infrared (IR) data communications. Hence, common forms of machine-readable media include, for example, a floppy disk, a flexible disk, a hard disk, a magnetic tape, any other magnetic medium, a CD or CD-ROM, a DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, a cache memory, any other memory chip or cartridge, a carrier wave transporting data or instructions, physical links bearing such a carrier wave, or any other medium from which a computer or the like can read in order to read or recover carried information.

Various forms of machine-readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution. For example, all or portions of the software may at times be communicated through the Internet, an Intranet, a wireless data communication network, or various other telecommunication networks. Such communications, for example, may serve to load the software from another computer (not shown) into the server or other platform(s) that serve as the data engine.

While the foregoing has described what are considered to be the best mode and/or other examples, it is understood that various modifications may be made therein and that the subject matter disclosed herein may be implemented in various forms and examples, and that the teachings may be applied in numerous applications, only some of which have been described herein. It is intended by the following claims to claim any and all applications, modifications and variations that fall within the true scope of the present teachings.

1. A method of preparing a document for later authentication, the document printed on identifiable stock, the method comprising steps of: acquiring stock identification data from a printed hardcopy of the document by a first sensor coupled with document processing equipment; obtaining content data for the document; associating the content data with the stock identification data; and storing the content data and stock identification data in a database.

2. The method according to claim 1, wherein the obtaining step includes obtaining content data from electronic data or from an image of the printed hardcopy of the document by a second sensor coupled with the document processing equipment.

3. The method according to claim 1, wherein the acquiring step includes acquiring the stock identification data by way of the first sensor selected from a coherent light beam interrogator, a RFID interrogator/analysis tool, or a high magnification imaging system.

4. The method according to claim 1, wherein the acquiring step includes acquiring stock identification data selected from embedded conductor or semiconductor devices, material deposited or printed on the document, or fibers embedded in the document.

5. The method according to claim 1, wherein the obtaining step includes obtaining the content data by the second sensor selected from an imaging system coupled with optical character recognition and symbol/picture analysis features.

6. The method according to claim 1, wherein the document processing equipment includes a scanner, copier, facsimile device, or kiosk.

7. The method according to claim 1, wherein the storing step includes storing the content data and stock identification data in a data storage medium and file structure capable of storing searchable data accessible locally or over WAN.

8. A method of authenticating a document printed on identifiable stock, the method comprising steps of: acquiring stock identification data from a printed hardcopy of the document by a first sensor coupled with document processing equipment; obtaining content data from an image of the printed hardcopy of the document by a second sensor coupled with the document processing equipment; comparing the content data and stock identification data with associated content data and stock identification data stored in a database; and returning an authentication result indicating whether or not the content data and stock identification data matched with the stored content data and stock identification data in the database.

9. The method according to claim 8, wherein the acquiring step includes acquiring the stock identification data by way of the first sensor selected from a coherent light beam interrogator, a RFID interrogator/analysis tool, or a high magnification imaging system.

10. The method according to claim 8, wherein the acquiring step includes acquiring stock identification data selected from embedded conductor or semiconductor devices, material deposited or printed on the document, or fibers embedded in the document.

11. The method according to claim 8, wherein the obtaining step includes obtaining the content data by the second sensor selected from an imaging system coupled with optical character recognition and symbol/picture analysis features.

12. The method according to claim 8, wherein the database further includes metadata associated with the document, the metadata including information selected from a document creation date and document classification information.

13. The method according to claim 8, wherein the document is selected from a stock certificate, will, contract, check, mortgage or coupon.

14. A method of generating a plurality of mailpieces containing inserts on document processing equipment for later authentication of the inserts, the method comprising steps of: associating addressee and/or address data with each of a plurality of inserts printed on identifiable stock; acquiring stock identification data from each of the plurality of inserts with a sensor; obtaining insert classification data for the plurality of inserts; storing the associated address and/or addressee data, acquired stock identification data and obtained insert classification data in a database; and generating the mailpieces containing the insert on the document processing equipment.

15. The method according to claim 14, wherein the obtaining step includes obtaining insert classification from a control system of the document processing equipment or a second sensor with imaging, optical character recognition or barcode reading capability.

16. The method according to claim 14, wherein the storing step includes storing the address and/or addressee data, stock identification data and insert classification data in a data storage medium and file structure capable of storing searchable data accessible locally or over WAN.

17. The method according to claim 14, further comprising a step of delivering the mailpieces to the address and/or addressee listed on each respective mailpiece.

18. The method according to claim 14, wherein the plurality of inserts are selected from redeemable coupons, credit cards or driver's licenses.

19. The method according to claim 14, wherein the document processing equipment is an inserter, scanner, copier, facsimile device, or kiosk.

20. A method of authenticating a mailpiece insert printed on identifiable stock, the method comprising steps of: obtaining stock identification data from the mailpiece insert; comparing the stock identification data with associated stock identification data stored in a database; gathering address and/or addressee data stored in the database based on a result of the comparing step; acquiring insert classification data associated with the plurality of mailpieces from the database; and compiling a report associating the insert classification data with the obtained address and/or addressee data.

21. The method of claim 20, wherein the compiling step includes compiling a marketing report including marketing information regarding the address and/or addressee.

22. The method according to claim 20, wherein the obtaining step includes obtaining the stock identification data by way of a sensor selected from a coherent light beam interrogator, a RFID interrogator/analysis tool, or a high magnification imaging system.

23. The method according to claim 20, wherein the obtaining step includes obtaining stock identification data selected from embedded conductor or semiconductor devices, material deposited or printed on the document, or fibers embedded in the document.